23/06/2024 - 29/06/2024

26/06/2024 21:12

I got AMC13Tool2.exe to build. I basically just replaced all instances of unordered maps with boost::unordered_map, and then it built. Here is the list of edits:

Launcher_commands_status.cc::242

      boost::unordered_map<std::string,std::string> parameters = itNode->getParameters();
      boost::unordered_map<std::string,std::string>::iterator itTable;

Launcher_commands_control.cc::588

      const boost::unordered_map<std::string,std::string> params = node.getParameters();
      for( boost::unordered_map<std::string,std::string>::const_iterator it = params.begin();

Status.hh::101

    std::string ParseRow(boost::unordered_map<std::string,std::string> & parameters,
             std::string const & addressBase) const;
    std::string ParseCol(boost::unordered_map<std::string,std::string> & parameters,
             std::string const & addressBase) const;

Status.cc::534

  std::string SparseCellMatrix::ParseCol(boost::unordered_map<std::string,std::string> & parameters,
                     std::string const & addressBase) const

Status.cc::487

  std::string SparseCellMatrix::ParseRow(boost::unordered_map<std::string,std::string> & parameters,
                     std::string const & addressBase) const
  {

Status.cc::350

    boost::unordered_map<std::string,std::string> parameters = node.getParameters();

Status.cc::70

      boost::unordered_map<std::string,std::string> parameters = itNode->getParameters();      
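
To check that no unqualified unordered_map was left behind after edits like the ones above, something like the following could be used. This is only a sketch: the function name find_plain_unordered_maps is made up for illustration, it only looks at *.cc and *.hh files, and it skips any line that already mentions boost, so a line mixing both forms would be missed.

```shell
# List source lines that mention unordered_map without a boost qualifier.
# Hypothetical helper, not part of the AMC13 software.
# Usage: find_plain_unordered_maps <source-dir>
find_plain_unordered_maps() {
    # First grep: every line mentioning unordered_map in .cc/.hh files.
    # Second grep: drop lines that already use boost (type or header).
    grep -rn --include='*.cc' --include='*.hh' 'unordered_map' "$1" \
        | grep -v -e 'boost::unordered_map' -e 'boost/unordered_map'
}
```

Running it on the top of the source tree prints file:line:text for each remaining hit, and prints nothing once every use is qualified.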

I also needed to add Python to my C++ include path (I didn't feel like editing the makefile). Some of the Python scripts produced invalid-syntax errors, but I ignored those.

 export CPLUS_INCLUDE_PATH=$CPLUS_INCLUDE_PATH:/opt/rh/rh-python36/root/usr/include/python3.6m
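
A slightly more defensive version of that export, as a sketch: it only appends the directory if Python.h is actually there, and it avoids a leading ":" when CPLUS_INCLUDE_PATH starts out empty (GCC treats an empty path element as the current directory). The function name add_python_include is hypothetical.

```shell
# Append a directory to CPLUS_INCLUDE_PATH only if it contains Python.h.
# Hypothetical helper for illustration.
# Usage: add_python_include <include-dir>
add_python_include() {
    dir=$1
    if [ -f "$dir/Python.h" ]; then
        # ${VAR:+...} expands to "VAR:" only when VAR is set and non-empty,
        # so no empty path element is created.
        export CPLUS_INCLUDE_PATH="${CPLUS_INCLUDE_PATH:+$CPLUS_INCLUDE_PATH:}$dir"
    else
        echo "Python.h not found in $dir" >&2
        return 1
    fi
}
```

With the path from this entry the call would be: add_python_include /opt/rh/rh-python36/root/usr/include/python3.6m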

After all these edits, I was able to run make successfully:

cd /home/installation_testing/packages/experiment/lxedaq/amc13/amc13_v1_2_18/
make

This produced the following files:

[root@dhcp-10-163-105-238 bin]# pwd
/home/installation_testing/packages/experiment/lxedaq/amc13/amc13_v1_2_18/tools/bin
[root@dhcp-10-163-105-238 bin]# ls
AMC13BenchTest.exe  AMC13Tool2.exe  AMC13ToolFlash.exe  LaTeXprint.exe
[root@dhcp-10-163-105-238 bin]#

To use AMC13Tool2.exe, I had to add the libraries built by the makefile to my LD_LIBRARY_PATH environment variable:

export LD_LIBRARY_PATH=/home/installation_testing/packages/experiment/lxedaq/amc13/amc13_v1_2_18/amc13/lib/:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=/home/installation_testing/packages/experiment/lxedaq/amc13/amc13_v1_2_18/tools/lib:$LD_LIBRARY_PATH
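
A quick way to confirm the exports took effect is to ask the dynamic loader which shared libraries it still cannot resolve. This is a sketch using ldd; the function name missing_libs is made up here.

```shell
# Print the shared libraries a binary needs that the loader cannot find.
# Hypothetical helper; prints nothing when everything resolves.
# Usage: missing_libs <path-to-binary>
missing_libs() {
    # ldd marks unresolved dependencies with "not found".
    ldd "$1" | grep 'not found'
}
```

From the tools directory, missing_libs bin/AMC13Tool2.exe should print nothing once both lib directories are on LD_LIBRARY_PATH.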

Finally, I could use the tool with this command (must specify path to board files):

[root@dhcp-10-163-105-238 tools]# bin/AMC13Tool2.exe -c 192.168.1.13 -p /home/installation_testing/packages/experiment/lxedaq/address_tables/
Address table path "/home/installation_testing/packages/experiment/lxedaq/address_tables/" set on command line
use_ch false
Created URI from IP address:
  T2: ipbusudp-2.0://192.168.1.13:50001
  T1: ipbusudp-2.0://192.168.1.14:50001
Using AMC13 software ver:0
Read firmware versions 0x813f 0x2e
flavor = 5  features = 0x000000b4
>

26/06/2024 21:21

https://bucms.bu.edu/twiki/bin/view/BUCMSPublic/AMC13Tool2

This page lists the useful AMC13Tool2 commands. Supposedly you can still set the AMC13 10GbE link the same way, by editing register 0x1c1c.

I did not actually try configuring the AMC13s with this


26/06/2024 22:02

I swapped the AMC13 we had in crate 2 into crate 1. I configured the system to return to a 1 crate setup (changed the T1, T2, and 10GbE IPs to be the same as the AMC13 that was previously in crate 1, then disabled AMC13002 in midas).

I suspect there is something wrong with the AMC13 I took out of crate 1 after doing tests at UW. I'll leave the system running at a high rate to see.


28/06/2024 16:12

Running at 5kHz, I was able to run for 6 hours:

e98f13f2785f7c77bded99e52b5ba4f8.png

21:00:13.355 2024/06/27 [MasterGM2,TALK] Alarm: CCC Run Aborted

14:55:01.448 2024/06/27 [MasterGM2,TALK] Alarm: DAQ | MasterGM2 discovered severe fill number mismatch

14:53:19.552 2024/06/27 [mhttpd,INFO] Run #246 started

Running at ~7kHz, I was able to run for 8 hours:

06:00:13.413 2024/06/27 [MasterGM2,TALK] Alarm: CCC Run Aborted
22:00:50.178 2024/06/26 [MasterGM2,TALK] Alarm: DAQ | MasterGM2 discovered severe fill number mismatch
21:59:38.197 2024/06/26 [mhttpd,INFO] Run #244 start

Both of these are much longer than any run we've had before. This is with the publisher off.
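
The run lengths can be read straight off the "started" and "CCC Run Aborted" timestamps in the messages above. A small sketch for doing that arithmetic, assuming GNU date (for the -d option) and the "HH:MM:SS.mmm YYYY/MM/DD" layout seen in these log lines; log_epoch and elapsed_hms are hypothetical names.

```shell
# Convert a log timestamp to seconds since the epoch.
# $1 = time with milliseconds (e.g. 14:53:19.552), $2 = date (e.g. 2024/06/27)
log_epoch() {
    # Drop the milliseconds and turn the slashes into dashes for date -d.
    date -u -d "$(echo "$2" | tr / -) ${1%.*}" +%s
}

# Print the elapsed time between two log timestamps.
# Usage: elapsed_hms <start-time> <start-date> <end-time> <end-date>
elapsed_hms() {
    secs=$(( $(log_epoch "$3" "$4") - $(log_epoch "$1" "$2") ))
    printf '%dh%02dm%02ds\n' $(( secs / 3600 )) $(( secs % 3600 / 60 )) $(( secs % 60 ))
}
```

For run #246, elapsed_hms 14:53:19.552 2024/06/27 21:00:13.355 2024/06/27 gives 6h06m54s, and for run #244, elapsed_hms 21:59:38.197 2024/06/26 06:00:13.413 2024/06/27 gives 8h00m35s.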

Trying to start a new run gives the normal errors:
7afc393828c272fc572408c1fdca3c93.png

So the frontends need to be reset when this happens.


28/06/2024 16:56

I tried programming the "old" AMC13 with AMC13Tool2 instead with the following steps:

export LD_LIBRARY_PATH=/home/installation_testing/packages/experiment/lxedaq/amc13/amc13_v1_2_18/amc13/lib/:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=/home/installation_testing/packages/experiment/lxedaq/amc13/amc13_v1_2_18/tools/lib:$LD_LIBRARY_PATH
bin/AMC13Tool2.exe -c 192.168.2.13 -p /home/installation_testing/packages/experiment/lxedaq/address_tables/

Then inside the CLI:

en 1-12
daq 1
wv 0x1c1c 0xc0a83301
rd 

and I was able to ping the 10GbE interface at 192.168.51.1, so it seemed to work.
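
The value written to 0x1c1c is just the IP address packed into one 32-bit word, one byte per octet (0xc0 = 192, 0xa8 = 168, 0x33 = 51, 0x01 = 1). A sketch for converting in both directions, so the word for a different address does not have to be worked out by hand; ip_to_hex and hex_to_ip are hypothetical helper names.

```shell
# Pack a dotted-quad IPv4 address into a 32-bit hex word.
# Usage: ip_to_hex 192.168.51.1
ip_to_hex() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address on dots into $1..$4
    IFS=$old_ifs
    printf '0x%02x%02x%02x%02x\n' "$1" "$2" "$3" "$4"
}

# Unpack a 32-bit hex word back into a dotted-quad address.
# Usage: hex_to_ip 0xc0a83301
hex_to_ip() {
    v=$(( $1 ))
    printf '%d.%d.%d.%d\n' $(( (v >> 24) & 255 )) $(( (v >> 16) & 255 )) \
        $(( (v >> 8) & 255 )) $(( v & 255 ))
}
```

ip_to_hex 192.168.51.1 prints 0xc0a83301, matching the wv command above, and hex_to_ip 0xc0a83301 prints 192.168.51.1.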

However, whenever I started a run, the GPU buffer filled immediately, at both 5kHz and 2kHz. This didn't seem to be the case for the 1 crate system, so I tried changing the buffer parameters to the same values I used at the UW setup:

tcp_thread.cxx::98

unsigned int TCPdatasizemax = 0x00800000;       ///< max data size 8MB

gpu_thread.cxx::91

int gpu_data_raw_size_max = 0x00800000; // 8MB, same as the tcp max

tcp_thread.h::135

#define TCP_BUF_MAX_FILLS 256

gpu_thread.h::45

#define GPU_BUFFER_SIZE 512

But that didn't seem to solve the problem.
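
For scale, the arithmetic behind those numbers can be sketched as below. Note the assumption: the upper bound only holds if every ring slot is allocated at the full maximum size, which I have not checked in tcp_thread or gpu_thread. The function names are made up for illustration.

```shell
# Convert a byte count (decimal or 0x hex) to whole MiB.
bytes_to_mib() {
    echo $(( $1 / 1048576 ))
}

# Upper bound in MiB for a ring of <slots> entries of <max-bytes> each.
# ASSUMPTION: every slot is allocated at the maximum size.
# Usage: ring_upper_bound_mib <slots> <max-bytes>
ring_upper_bound_mib() {
    echo $(( $1 * $2 / 1048576 ))
}
```

bytes_to_mib 0x00800000 prints 8, ring_upper_bound_mib 256 0x00800000 prints 2048 (TCP side), and ring_upper_bound_mib 512 0x00800000 prints 4096 (GPU side).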

I then turned off trigger throttling by commenting out these lines:

gpu_thread.cpp::539

    //Do not proceed if the GPU buffer is full
    //This is currently broken (throttles, but never returns)
    /*
    if ( (gpu_buffer_filled >= GPU_BUFFER_SIZE - 1) || (tcp_buffer_filled >= TCP_BUF_MAX_FILLS - 1) )
    {
        fc7help->setThrottleTriggers(encoder_fc7, frontend_index, 1);
        triggersThrottled = true;
        
        // Check if at least 10 minutes have passed since the last message, print warning if so.
        std::time_t current_time = std::time(nullptr); // Get the current time
        int seconds_between_messages = 600;
        if (isElapsedTime(current_time, last_msg_time, seconds_between_messages)) {
            cm_msg(MINFO, __FILE__, "Requesting Encoder FC7 to throttle TTC triggers to clear TCP/GPU ring buffers.");
            last_msg_time = current_time; // Update the last message time
        }
        continue;
    } 
    else if (triggersThrottled) {
        fc7help->setThrottleTriggers(encoder_fc7, frontend_index, 0);
        triggersThrottled = false;
        cm_msg(MINFO, __FILE__, "Trigger throttling removed");
    }
    */

But that didn't do anything but cause the AMC13 to crash instead of throttle.

So I power cycled the crate and reconfigured the old AMC13 using AMC13Tool (not AMC13Tool2.exe) but that didn't fix the issue either.

Somehow the act of using the 2 crate system is causing the "old" AMC13 to lag behind in a way we weren't seeing before?
144e3d1c22f5c11577c0f4c8a2258129.png

Though the ring buffer is always filling up in the "new" AMC13 (I think it's getting ahead somehow, then waiting for the "old" AMC13 to catch up):

16:50:16.281 2024/06/28 [mhttpd,INFO] Run #250 stopped

16:50:16.180 2024/06/28 [mhttpd,ERROR] [midas.cxx:4292:cm_transition_call,ERROR] cannot connect to client "Ebuilder" on host localhost, port 40454, status 503

16:50:16.180 2024/06/28 [mhttpd,ERROR] [midas.cxx:12142:rpc_client_connect,ERROR] cannot connect to "localhost" port 40454: cannot connect to host "localhost" port 40454, errno 111 (Connection refused)

16:50:16.081 2024/06/28 [AMC13001,ERROR] [frontend.cpp:2584:frontend.cpp,ERROR] TCP/GPU/Midas fill numbers do not match at the end of the run.

16:50:16.080 2024/06/28 [AMC13002,TALK] Warning: DAQ | End of Run Fill Number Mismatch from AMC13002

16:50:16.080 2024/06/28 [AMC13002,ERROR] [frontend.cpp:2584:frontend.cpp,ERROR] TCP/GPU/Midas fill numbers do not match at the end of the run.

16:49:57.973 2024/06/28 [Ebuilder,INFO] evb exit status 509, auto_restart 0

16:49:57.973 2024/06/28 [AMC13001,INFO] Requesting Encoder FC7 to throttle TTC triggers to clear TCP/GPU ring buffers.

16:49:57.973 2024/06/28 [AMC13001,TALK] Warning: DAQ | AMC13001 GPU Ring buffer close to full (90.039062%)

16:49:57.972 2024/06/28 [AMC13002,INFO] Requesting Encoder FC7 to throttle TTC triggers to clear TCP/GPU ring buffers.

16:49:51.978 2024/06/28 [mhttpd,INFO] Alarm "End of Run Master" reset

16:49:48.949 2024/06/28 [MasterGM2,TALK] Warning: DAQ | Suspect fill number mismatch. Check Event numbers!

16:49:48.949 2024/06/28 [MasterGM2,ERROR] [frontend.cpp:1284:frontend.cpp,ERROR] End of Run: checking other frontend fill numbers, time out!

16:49:48.949 2024/06/28 [MasterGM2,INFO] End of Run: DC7 Triggers Received 8710  Count triggers 3146

16:49:48.949 2024/06/28 [MasterGM2,ERROR] [frontend.cpp:1205:end_of_run,ERROR] FC7-10: Unable to Verify Run has Stopped (Run state still in progress)

16:49:14.561 2024/06/28 [mhttpd,INFO] Alarm "Frontend GPU Buffer Error" reset

16:49:13.810 2024/06/28 [mhttpd,INFO] Alarm "Frontend TCP Buffer Error" reset

16:48:16.614 2024/06/28 [mhttpd,INFO] Run #250 started

28/06/2024 17:12

I'm now remembering that the ODB mode for the master frontend only looks at AMC13001 for its triggers (i.e. it only works for a 1 crate system). I'm going to "revive" the meinberg system.

I put the meinberg back in and connected all the cables. When I tried running the 1 crate system, I immediately got constant CCC run aborts from the Master. To remedy this, I had to rerun:

sudo /sbin/modprobe mbgclock

After this, it appeared the CCC run aborts went away for the 1 crate system.
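
Since the module apparently has to be reloaded by hand, a check before starting a run might save some confusion. A sketch that looks for the module name in /proc/modules; module_loaded is a hypothetical helper, and the optional second argument only exists so the check can be pointed at a different file.

```shell
# Succeed if the named kernel module appears in the loaded-module list.
# Usage: module_loaded <name> [modules-file]   (default: /proc/modules)
module_loaded() {
    # The first column of /proc/modules is the module name.
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}
```

For example: module_loaded mbgclock || sudo /sbin/modprobe mbgclock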

I then tried enabling the second crate and this seemed to do the trick. The GPU Buffer stopped filling immediately, and runs seemed to work for a bit.


28/06/2024 17:46